Markov decision process

Results: 537

61. Stat 260/CS Learning in Sequential Decision Problems. Peter Bartlett. 1. Recall: MDPs. 2. Value iteration. 3. Policy iteration.

Source URL: www.stat.berkeley.edu

Language: English - Date: 2014-11-25 12:45:38
62. Classification-based Policy Iteration with a Critic. V. Gabillon, A. Lazaric, M. Ghavamzadeh & B. Scherrer. INRIA Lille - Nord Europe, Team Sequel,

Source URL: victorgabillon.nfshost.com

Language: English - Date: 2011-06-30 11:49:57
63. LETTER doi:nature14236 Human-level control through deep reinforcement learning

Source URL: storage.googleapis.com

Language: English - Date: 2016-01-26 06:53:21
64. MDP Cheatsheet Reference. Author: John Schulman. (F) = facts that are a bit more technical. 1 Markov Decision Process

Source URL: rll.berkeley.edu

Language: English - Date: 2016-01-25 13:14:56
65. Stat 260/CS Learning in Sequential Decision Problems. Peter Bartlett. 1. Markov decision processes and partially observable Markov decision processes. 2. Value functions, Q functions.

Source URL: www.stat.berkeley.edu

Language: English - Date: 2014-11-25 12:45:37
66. arXiv:1402.6763v1 [math.OC] 27 Feb. Linear Programming for Large-Scale Markov Decision Problems. Yasin Abbasi-Yadkori, Queensland University of Technology

Source URL: arxiv.org

Language: English - Date: 2014-02-27 20:30:05
67. Deterministic MDPs with Adversarial Rewards and Bandit Feedback. Raman Arora, TTIC, 6045 S. Kenwood Ave. Chicago, IL 60637, USA

Source URL: dept.stat.lsa.umich.edu

Language: English - Date: 2012-09-12 18:50:24
68. Rollout Allocation Strategies for Classification-based Policy Iteration. Victor Gabillon, Alessandro Lazaric

Source URL: victorgabillon.nfshost.com

Language: English - Date: 2010-07-01 09:47:14
69. Playing Atari with Deep Reinforcement Learning. Volodymyr Mnih, Koray Kavukcuoglu

Source URL: arxiv.org

Language: English - Date: 2013-12-19 20:23:45
70. approximate-mdps-notes.dvi

Source URL: www.stat.berkeley.edu

Language: English - Date: 2014-11-25 12:45:37